oplogPopulator: synthesise oplog key via $addFields in connector pipeline#2744
oplogPopulator: synthesise oplog key via $addFields in connector pipeline#2744delthas wants to merge 2 commits into
Conversation
Hello delthas,My role is to assist you with the merge of this Available options
Available commands
Status report is not available. |
Waiting for approvalThe following approvals are needed before I can proceed with the merge:
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files
... and 8 files with indirect coverage changes
@@ Coverage Diff @@
## development/9.5 #2744 +/- ##
===================================================
- Coverage 74.73% 74.60% -0.13%
===================================================
Files 199 200 +1
Lines 13650 13661 +11
===================================================
- Hits 10201 10192 -9
- Misses 3439 3459 +20
Partials 10 10
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
1193adb to
20249e5
Compare
4edd41b to
e24ee0a
Compare
e24ee0a to
37fb5a1
Compare
francoisferrand
left a comment
There was a problem hiding this comment.
will the recently updated logic correctly update the pipeline in all cases?
or do we have some situations were we may still stay with an old pipeline?
37fb5a1 to
e0ce22c
Compare
Yes, on the next Backbeat restart, no existing connector stays on the old pipeline:
The one path that doesn't flip in-process: an already-running Backbeat with old code in memory — the in-process Also addressed in
|
e0ce22c to
bf8bafd
Compare
bf8bafd to
def1be2
Compare
…line The current Kafka message key projects fullDocument.value.key, which is null on update events since BB-355 removed change.stream.full.document=updateLookup. Every update for a given bucket therefore serialises to the same key Struct ({ns:{coll:<bucket>}, fullDocument:null}) and lands on one partition, breaking per-object ordering across op types (insert vs update) and pinning a hot bucket's update traffic to a single partition (a blocker for the oplog scaling work tracked in BB-756). Fix: insert a $addFields stage into the MongoDB connector's change-stream pipeline (right after $match, before the conditional location-strip $set) that synthesises a top-level 'key' field by $ifNull coalescing $fullDocument.value.key and $updateDescription.updatedFields.value.key. The connector's output.schema.key replaces the broken fullDocument nested record with a sibling 'key' field; the existing 'ns:{coll}' projection is preserved, so the Kafka message key remains {ns:{coll:<bucket>}, key:<object-key>} — bucket-level isolation is unchanged. All events for the same logical S3 object (insert, master/version updates, replication-status updates, delete-marker updates) yield the same key value and hash to the same partition. Master and version documents share the same value.key, so master/version events also collapse to the same partition without prefix-stripping. The $ifNull variant relies on metadata always writing the whole 'value' subdocument (full $set). Confirmed against arsenal: there is no partial dotted-$set path for object MD today. A hypothetical future partial $set would not populate updateDescription.updatedFields.value.key and that event would mis-partition — accepted risk per the ticket discussion. The change is propagated to existing connectors via the existing in-place PUT /connectors/{name}/config reconciliation (no recreate, no resume-token loss — the change touches the key schema + the added $addFields stage only, not the $match stage). Downstream oplog consumers do not read the Kafka message key, so the new key shape is consumer-transparent. History / discussion: https://scality.atlassian.net/browse/BB-768?focusedCommentId=477122 This supersedes the earlier SMT-based approach (a Kafka Connect Single Message Transform deriving the key from documentKey._id with master/version prefix-stripping), which would have spanned three repos and added a new Java artifact + operator feature flag. Closed in favour of this single-repo fix after a per-event-cost measurement showed the $addFields adds ~600–900 ns/event ≈ ~1–2% of one core at 20k ops/s on the mongod — not a throughput concern at our target rates. Superseded work (to be closed): #2741 (SMT-track Backbeat PR) scality/Zenko#2410 (ZENKO-5274 — Java SMT in kafka-connect image) https://scality.atlassian.net/browse/ZKOP-553 (operator feature flag — no longer needed) Issue: BB-768
def1be2 to
3869977
Compare
…sed key Adds a Prometheus counter `s3_oplog_event_missing_key_total` (labelled by operationType) that increments when a consumed oplog event reaches downstream processing in ListRecordStream but lacks the top-level 'key' field synthesised by the connector pipeline in the previous commit. Should stay at zero in steady-state. A non-zero rate signals a regression — e.g. a future write path on object MD that doesn't $set the whole 'value' subdocument and so bypasses the $ifNull coalesce. Wired as a new `KafkaLogConsumerMetrics` module mirroring the LifecycleMetrics pattern (static class, per-method try/catch + handleError so a prometheus-side failure can't propagate into the oplog read path). Per-PR review feedback. Issue: BB-768
3869977 to
f53649e
Compare
|
LGTM — clean, well-tested change. The $addFields key synthesis is correctly placed in the pipeline, the Avro key schema flattening is consistent, the $ifNull coalescing handles insert/update/delete event shapes properly, and the missing-key metric provides good observability for the coupling caveat (partial dotted $set regression). Pipeline validation logic (extractBucketsFromConfig, isValid) only inspects pipeline[0], so the new stage at index 1 is transparent. Test coverage is thorough across unit and functional layers. |
|
@francoisferrand : moving this back to 9.4 since 9.4 == 9.5 atm and not sure this warrants a different branch. |
Request integration branchesWaiting for integration branch creation to be requested by the user. To request integration branches, please comment on this pull request with the following command: Alternatively, the |
Summary
Two commits on this branch:
oplogPopulator: synthesise oplog key via $addFields in connector pipelineextensions/oplogPopulator/pipeline/PipelineFactory.js: insert a$addFieldsstage into the connector's change-stream pipeline (right after$match, before the conditional location-strip$set) that synthesises a top-levelkeyfield via$ifNullcoalescing$fullDocument.value.keyand$updateDescription.updatedFields.value.key.extensions/oplogPopulator/constants.js: replace the brokenfullDocumentnested record inoutput.schema.keywith a top-levelkey: [string, null]field. The existingns: {coll}projection is preserved, so the Kafka message key remains{ns: {coll: <bucket>}, key: <object-key>}— bucket-level isolation is unchanged.connectorConfigfixture used byConnectorsManagertests.queuePopulator: counter for oplog events processed without a synthesised keylib/queuePopulator/KafkaLogConsumer/KafkaLogConsumerMetrics.jsmodule (mirrorsLifecycleMetrics: static class, per-methodtry/catch+handleError).s3_oplog_event_missing_key_totallabelled byopType, incremented fromListRecordStream._transformwhen a consumed event passes theobjectMdcheck but lacks the top-levelkey. Defensive observability — added per review.Pure Backbeat change. No SMT, no Zenko image change, no operator flag. Always on.
Context
BB-768: the current key schema projects
fullDocument.value.key, which isnullonupdateevents (since BB-355 removedchange.stream.full.document=updateLookup). Every update for a given bucket therefore serialises to the same key Struct ({ns:{coll:<bucket>}, fullDocument:null}) and lands on one partition — both an ordering problem (insert → update on different partitions for the same object) and a throughput problem (a hot bucket's update traffic can't scale with partition count, blocking BB-756).After ticket discussion (Jira comment 477122) we picked the pipeline-
$addFieldsroute over a Kafka Connect SMT: it's a single-repo fix, and the measured server-side cost is ~600–900 ns/event ≈ ~1–2% of one core at 20k ops/s — not a throughput concern.For the same logical S3 object — insert (key in
fullDocument.value.key), master PUT update (key inupdatedFields.value.key), replication-status update, delete-marker update — all yield the samekeyvalue → same partition. Master and version documents both store the samevalue.key, so master/version events also collapse to the same partition without prefix-stripping.Coupling caveat
The
$ifNullvariant relies on metadata always writing the wholevaluesubdocument (full$set). Confirmed today against arsenal — there's no partial dotted-$setpath for object MD. A future partial update ($set: {"value.x": …}) would not populateupdatedFields.value.keyand the resulting event would mis-partition. Accepted risk per the ticket discussion. Thes3_oplog_event_missing_key_totalcounter exists to detect exactly this regression.Migration
The change is propagated to existing connectors via the existing in-place
PUT /connectors/{name}/configreconciliation (no recreate, no resume-token loss — the change touches the key schema + the added$addFieldsstage only, not the$matchstage). Downstream oplog consumers don't read the Kafka message key, so the new key shape is consumer-transparent. See this comment for the full migration trace.Issue: BB-768